Robust Similarity Measures for Named Entities Matching

Authors

Erwan Moreau

François Yvon

Olivier Cappé

Abstract

Matching coreferent named entities without prior knowledge requires good similarity measures. Soft-TFIDF is a fine-grained measure that performs well on this task. We propose to enhance this kind of metric through a generic model in which measures may be mixed, and we show experimentally the relevance of this approach.
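For readers unfamiliar with Soft-TFIDF, the following is a minimal sketch of the measure as it is commonly described: a TF-IDF cosine in which tokens need only be approximately equal, as judged by an inner string similarity. It is not the authors' implementation. The function names, the smoothed weighting formula, and the default threshold are assumptions made for illustration. The inner measure is usually Jaro-Winkler; here it is swapped for the ratio from Python's standard `difflib.SequenceMatcher` so that the example needs no third-party package. How the paper's generic model mixes measures is not described in the abstract, so it is not attempted here; the `inner_sim` parameter only shows that the inner measure can be replaced.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def char_sim(a, b):
    """Inner token similarity in [0, 1]; a stand-in for Jaro-Winkler (assumption)."""
    return SequenceMatcher(None, a, b).ratio()


def tfidf_vector(tokens, doc_freq, n_docs):
    """L2-normalised TF-IDF weights; the smoothing used here is an assumption."""
    tf = Counter(tokens)
    weights = {
        w: math.log(1 + c) * math.log(1 + n_docs / doc_freq.get(w, 1))
        for w, c in tf.items()
    }
    norm = math.sqrt(sum(v * v for v in weights.values()))
    return {w: v / norm for w, v in weights.items()} if norm else weights


def soft_tfidf(s_tokens, t_tokens, doc_freq, n_docs, theta=0.9, inner_sim=char_sim):
    """Sum, over tokens of S whose closest token in T reaches theta,
    of weight(w, S) * weight(closest, T) * inner_sim(w, closest)."""
    vs = tfidf_vector(s_tokens, doc_freq, n_docs)
    vt = tfidf_vector(t_tokens, doc_freq, n_docs)
    score = 0.0
    for w, weight_w in vs.items():
        # Closest token of T to w under the inner similarity.
        best_v, best_sim = max(
            ((v, inner_sim(w, v)) for v in vt),
            key=lambda pair: pair[1],
            default=(None, 0.0),
        )
        if best_v is not None and best_sim >= theta:
            score += weight_w * vt[best_v] * best_sim
    return score


if __name__ == "__main__":
    # Toy corpus of tokenised names, used only to derive document frequencies.
    names = [["erwan", "moreau"], ["francois", "yvon"], ["olivier", "cappe"]]
    doc_freq = Counter(w for name in names for w in set(name))
    print(soft_tfidf(["erwan", "morau"], names[0], doc_freq, len(names), theta=0.8))
```

In this sketch a misspelt token such as "morau" still contributes to the score because its closest counterpart "moreau" passes the threshold, whereas a plain TF-IDF cosine would treat the two tokens as unrelated.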


Related resources

Evaluation of Similarity Measures for Template Matching

Image matching is a critical process in various photogrammetry, computer vision and remote sensing applications such as image registration, 3D model reconstruction, change detection, image fusion, pattern recognition, autonomous navigation, and digital elevation model (DEM) generation and orientation. The primary goal of the image matching process is to establish the correspondence between two ...


Semi-automatic Labeling of (Coreferent) Named Entities: An Experimental Study

In this paper, we investigate the problem of matching coreferent named entities extracted from text collections in a robust way: our long-term goal is to build similarity methods without (or with the minimum amount of) prior knowledge. In this framework, string similarity measures are the main tool at our disposal. Here we focus on the problem of evaluating such a task, especially in finding a m...


Robust, Light-weight Approaches to compute Lexical Similarity

Most text processing systems need to compare lexical units (words, entities, semantic concepts) with each other as a basic processing step within large and complex systems. A significant amount of research has taken place in formulating and evaluating multiple similarity metrics, primarily between words. Often, such techniques are resource-intensive or are applicable only to specific use cases...


Chinese Entity Relation Extraction Based on Word Co-occurrence

Chinese entity relation extraction is a part of entity relation extraction. According to entity relation extraction technology and the features of Chinese news corpus, this paper proposes a novel method for Chinese entity relation extraction. The method, named WCORE (word co-occurrence relation extraction), first measures the semantic similarity by word co-occurrence and then adopts pattern m...


Mining Document Collections to Facilitate Accurate Approximate Entity Matching

Many entity extraction techniques leverage large reference entity tables to identify entities in documents. Often, an entity is referenced in document collections differently from that in the reference entity tables. Therefore, we study the problem of determining whether or not a substring “approximately” matches with a reference entity. Similarity measures which exploit the correlation between...




Publication date: 2008
